INTRODUCTION


The two-stage least squares method of linear estimation of coefficients was developed by Theil (1958). Basmann (1957) independently developed a similar solution, under the name of the Generalized Classical method of linear estimation, which leads to equivalent estimators. Several studies have been conducted more recently on the effectiveness of the method and its limitations.

The two-stage least squares method was developed to replace existing methods (indirect least squares, least variance ratio, or limited-information single equation) by providing a method of more general applicability that is also less expensive to apply.

The problem to which this method is applied differs from single-equation models in two ways. First, although interest may center on a single equation of a system, the entire system of relations is considered simultaneously to obtain a solution. Second, in many instances the problem may be to estimate all the parameters in a model and to make predictions from the complete model. In other words, although it may be required to analyze only a single equation of the model, information from the entire system is included in the solution. In a system of equations the variables may be classified as endogenous and exogenous. Endogenous variables are those whose values are determined by the simultaneous interaction of the relations in the model, and exogenous variables are those whose values are determined independently of the relations in the model.



In the first stage of a two-stage least squares solution, ordinary least squares is applied to the entire set of predetermined variables to obtain estimates of the endogenous variables. In the second stage, these estimates are substituted in the system and ordinary least squares is applied again to a particular equation or a set of equations of the system to estimate the required parameters.

The method is not only relatively expeditious in terms of the time required for calculations, but the estimates derived can also be shown to be asymptotically unbiased, consistent, and minimum-variance.




DEFINITION  OF  THE  PROBLEM  AND  ASSOCIATED  TERMS 

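In a system of linear equations containing G endogenous variables $y_{1t}, \ldots, y_{Gt}$ and K exogenous variables $x_{1t}, \ldots, x_{Kt}$, a typical set of equations may be written as:

$$ y_{1t} = \sum_{i=2}^{H} \beta_{1i} y_{it} + \sum_{i=1}^{K^*} \gamma_{1i} x_{it} + e_{1t}, \qquad t = 1, \ldots, T \tag{1} $$

where

$$ H - 1 < G; \qquad K^* < K $$

The quantities H-1 and K* indicate the number of variables of the system present in this particular set of equations. This set of equations may in turn be written in matrix form as:

$$ y_1 = Y_2 \beta_2 + X_* \gamma_* + e $$

where $y_1$ is the vector of T observations on the left-hand variable, $Y_2$ is the T by (H-1) matrix of observations on the right-hand endogenous variables, $X_*$ is the T by K* matrix of observations on the exogenous variables in the equation, $\beta_2$ and $\gamma_*$ are the corresponding coefficient vectors, and $e$ is the vector of stochastic errors.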


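The stochastic errors are assumed to have zero means and finite variances and covariances:

$$ E(e_{it}) = 0, \qquad E(e_{it} e_{jt}) = \sigma_{ij}, \qquad i, j = 1, 2, \ldots, G $$

$$ \sigma_{ij} < \infty, \qquad s, t = 1, 2, \ldots, T $$

$$ E(e_{it} e_{js}) = 0, \qquad s \neq t $$

and, for all s, t,

$$ E(x_{kt} e_{is}) = 0, \qquad k = 1, 2, \ldots, K $$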


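To illustrate the basis for not applying the simple least squares method to this system, consider a simple example:

$$ y_{1t} = \alpha + \beta y_{2t} + e_t \tag{2} $$

$$ y_{2t} = y_{1t} + x_t \tag{3} $$

Substituting (2) in (3) yields

$$ y_{2t} = \alpha + \beta y_{2t} + x_t + e_t $$

or

$$ y_{2t} = \frac{\alpha}{1-\beta} + \frac{1}{1-\beta}\, x_t + \frac{1}{1-\beta}\, e_t $$

so that $y_{2t}$ is seen in general to be influenced by $e_t$. Furthermore,

$$ E(y_{2t}) = \frac{\alpha}{1-\beta} + \frac{1}{1-\beta}\, x_t $$

so that

$$ E\{ e_t\, [\, y_{2t} - E(y_{2t}) \,] \} = \frac{1}{1-\beta}\, E(e_t^2) \neq 0. $$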

The random error term and the explanatory variable, $y_{2t}$, in the equation are thus correlated, and the direct application of the simple least squares method will not yield unbiased estimates.
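The effect of this correlation can be seen in a small simulation. The following is only a sketch under stated assumptions: the parameter values (alpha = 1, beta = 0.5), the unit variances, the seed and the sample size are hypothetical choices, and x is drawn at random once for convenience although the text treats it as non-stochastic. With these values the direct least squares slope settles near 0.75 rather than 0.5, while the estimate that first regresses $y_{2t}$ on $x_t$ stays close to the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000                      # large sample so the limiting values are visible
alpha, beta = 1.0, 0.5        # hypothetical parameter values

x = rng.normal(size=T)        # exogenous variable (drawn once for illustration)
e = rng.normal(size=T)        # disturbance of equation (2)

# Reduced form obtained by substituting (2) in (3)
y2 = (alpha + x + e) / (1.0 - beta)
y1 = alpha + beta * y2 + e    # equation (2)

def slope(u, v):
    """Least squares slope of v on u, with an intercept."""
    return np.cov(u, v)[0, 1] / np.var(u, ddof=1)

b_ols = slope(y2, y1)                                  # direct least squares
y2_hat = y2.mean() + slope(x, y2) * (x - x.mean())     # first stage
b_2sls = slope(y2_hat, y1)                             # second stage

print(f"true beta = {beta}, direct = {b_ols:.3f}, two-stage = {b_2sls:.3f}")
```

The limiting value 0.75 follows from the covariance derived above: the bias is $(1-\beta)\sigma_e^2 / (\sigma_x^2 + \sigma_e^2)$, which equals 0.25 for the values assumed here.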

DEVELOPMENT OF TWO-STAGE LEAST SQUARES METHOD

Since the random error vector and the explanatory variables cannot be assumed independent, the number of available methods of estimation is reduced. However, the mutual relationships which are the cause of the complication can be used to make estimation possible.

Consider the system of simultaneous linear equations outlined in the previous section and assume this system contains K > K* predetermined variables, each of which assumes non-stochastic values. Under these conditions it is possible to base the estimation method on the independence of the random error vector, $e$, and the set of all predetermined variables in the entire system. More precisely:

Equation (1) is one of a system of G > H stochastic linear equations in G jointly dependent and $K \ge K^* + H - 1$ predetermined variables. This system can be solved for the jointly dependent variables.

Using the above assumptions it is now possible to apply simple least squares to the system using the matrix X of values for all predetermined variables, $X_*$ being a submatrix of X. From the above assumption, all jointly dependent variables can be written as stochastic linear functions of X.

Applying simple least squares, estimators for $Y_2$, the matrix of observations on the (H-1) right-hand dependent variables, are obtained and are written in matrix form as

$$ \hat{Y}_2 = X (X'X)^{-1} X' Y_2 $$

so that $Y_2$ may be rewritten as

$$ Y_2 = X (X'X)^{-1} X' Y_2 + V $$

where V denotes the matrix of reduced-form residuals for the (H-1) dependent variables appearing on the right-hand side of equation (1).
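The first stage can be sketched numerically as follows. The dimensions, the seed and the randomly generated data are hypothetical and serve only to show the decomposition of $Y_2$ into fitted values and residuals; `H1` stands for H-1.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K, H1 = 20, 4, 2           # hypothetical sizes; H1 stands for H-1

X = rng.normal(size=(T, K))   # all predetermined variables of the system
Y2 = X @ rng.normal(size=(K, H1)) + rng.normal(size=(T, H1))

# Reduced-form coefficients by least squares: (X'X)^{-1} X'Y2
Pi_hat = np.linalg.solve(X.T @ X, X.T @ Y2)
Y2_hat = X @ Pi_hat           # fitted values X (X'X)^{-1} X'Y2
V = Y2 - Y2_hat               # reduced-form residuals

# The residuals are orthogonal to X and to the fitted values
print(np.allclose(X.T @ V, 0.0, atol=1e-8))
print(np.allclose(V.T @ Y2_hat, 0.0, atol=1e-8))
```

Solving the normal equations with `np.linalg.solve` avoids forming $(X'X)^{-1}$ explicitly, which is a numerical convenience rather than part of the method described in the text.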

Now equation (1) may be rewritten as:

$$ y_1 = (Y_2 - V)\beta_2 + X_*\gamma_* + (e + V\beta_2) = [\,(Y_2 - V) \;\; X_*\,] \begin{bmatrix} \beta_2 \\ \gamma_* \end{bmatrix} + (e + V\beta_2) $$

Applying simple least squares to this relation gives

$$ \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} = (A'A)^{-1} A' y_1 $$

where $A = [\,(Y_2 - V) \;\; X_*\,]$. Now

$$ (Y_2 - V)'(Y_2 - V) = Y_2'Y_2 - V'Y_2 - Y_2'V + V'V $$

and

$$ V'Y_2 = V'(\hat{Y}_2 + V) = V'\hat{Y}_2 + V'V. $$

Since it is a property of the least squares fit that the residual is uncorrelated with the estimated value, it follows that $V'\hat{Y}_2 = 0$. This yields the result:

$$ V'Y_2 = V'V. $$

Similarly

$$ Y_2'V = V'V. $$

Hence

$$ (Y_2 - V)'(Y_2 - V) = Y_2'Y_2 - V'V. $$

Moreover,

$$ X'V = X'[\,Y_2 - X(X'X)^{-1}X'Y_2\,] = X'Y_2 - X'Y_2 = 0, $$

which illustrates the property of least squares that the residual vector is uncorrelated with the explanatory variables. Since X'V equals zero, $X_*'V$ must also equal zero, because $X_*$ is a submatrix of X.


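Returning to the solution for

$$ \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} $$

and using the above results, the two-stage estimator may be written

$$ \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} = \begin{bmatrix} (Y_2-V)'(Y_2-V) & (Y_2-V)'X_* \\ X_*'(Y_2-V) & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2-V)'y_1 \\ X_*'y_1 \end{bmatrix} = \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2-V)'y_1 \\ X_*'y_1 \end{bmatrix} $$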
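The equivalence of the two forms of the estimator can be checked numerically. This is a minimal sketch on simulated data: the sizes, coefficient values and seed are hypothetical, `H1` stands for H-1, and $X_*$ is taken to be the first K* columns of X.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K, K_star, H1 = 200, 4, 2, 1     # H1 = H-1; K >= H1 + K_star holds

X = rng.normal(size=(T, K))
X_star = X[:, :K_star]              # predetermined variables in the equation
e = rng.normal(size=T)
# Y2 depends on e, so it is correlated with the disturbance
Y2 = X @ rng.normal(size=(K, H1)) + 0.8 * e[:, None] + rng.normal(size=(T, H1))
beta2, gamma_star = np.array([0.5]), np.array([1.0, -1.0])
y1 = Y2 @ beta2 + X_star @ gamma_star + e

# First stage: fitted values and reduced-form residuals
Y2_hat = X @ np.linalg.solve(X.T @ X, X.T @ Y2)
V = Y2 - Y2_hat

# Form 1: least squares of y1 on A = [(Y2 - V)  X_star]
A = np.hstack([Y2_hat, X_star])
est_1 = np.linalg.solve(A.T @ A, A.T @ y1)

# Form 2: partitioned matrix written with Y2'Y2 - V'V and Y2'X_star
M = np.block([[Y2.T @ Y2 - V.T @ V, Y2.T @ X_star],
              [X_star.T @ Y2,       X_star.T @ X_star]])
rhs = np.concatenate([(Y2 - V).T @ y1, X_star.T @ y1])
est_2 = np.linalg.solve(M, rhs)

print(est_1)
print(est_2)
```

Both vectors hold the estimate of $\beta_2$ first, followed by the estimates of $\gamma_*$; they agree up to rounding because $V'\hat{Y}_2 = 0$ and $X_*'V = 0$.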


The need for the inequality $K \ge H - 1 + K^*$ is now apparent. The matrix A defined above is of order T by H-1+K*. Hence A'A is a square symmetric matrix of order H-1+K*, and $\rho(A'A) = \rho(A)$, where $\rho$ denotes rank. Now

$$ A = [\,(Y_2 - V) \;\; X_*\,] = X \left[\, (X'X)^{-1}X'Y_2 \;\; \begin{pmatrix} I \\ 0 \end{pmatrix} \right], $$

where I is the identity matrix of order K* and 0 is the null matrix of order K-K* by K*. Thus the rank of A cannot be greater than the rank of X, which is K. If the rank of A is less than H-1+K*, then A'A is singular and no solution can be obtained. This will happen if K < H-1+K*. However, since the inequality $K \ge H - 1 + K^*$ was originally assumed to hold, a solution is assured.
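The rank argument can be illustrated with a case in which the inequality fails. The sketch below uses hypothetical sizes (K = 3, H-1 = 2, K* = 2) and random data; the explicit tolerance passed to `matrix_rank` is a choice made for this example.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K, K_star, H1 = 30, 3, 2, 2      # here K = 3 < H1 + K_star = 4

X = rng.normal(size=(T, K))
X_star = X[:, :K_star]
Y2 = X @ rng.normal(size=(K, H1)) + rng.normal(size=(T, H1))

# Every column of A is a linear combination of the columns of X
Y2_hat = X @ np.linalg.solve(X.T @ X, X.T @ Y2)
A = np.hstack([Y2_hat, X_star])     # T by (H1 + K_star)

rank_A = np.linalg.matrix_rank(A, tol=1e-8)
print(rank_A, A.shape[1])           # rank of A versus number of columns
```

Because the rank of A is bounded by K, the matrix A'A is singular here and the second-stage normal equations have no unique solution.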

The two-stage least squares method of estimation may be justified in the following heuristic manner. If the parent reduced-form random errors corresponding to $Y_2$, say $\bar{V}$, were known, simple least squares could be applied to

$$ y_1 = (Y_2 - \bar{V})\beta_2 + X_*\gamma_* + (e + \bar{V}\beta_2). $$

In this case the objection that some of the right-hand variables are not independent of the random error vector is no longer valid, $(Y_2 - \bar{V})$ being an exact linear function of X and hence non-stochastic. The matrix $\bar{V}$ is not known, but it can be estimated by means of V, and the sampling error tends to zero for increasing T under appropriate conditions, the primary condition being that the assumptions made previously about the random error vector are true. Applying least squares as before, the estimates are again:

$$ \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} = \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2-V)'y_1 \\ X_*'y_1 \end{bmatrix} $$

PROPERTIES  OF  THE  ESTIMATORS 


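By defining the sampling error as

$$ \eta = \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} - \begin{bmatrix} \beta_2 \\ \gamma_* \end{bmatrix} $$

and using the above equations it is apparent that

$$ \eta = \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2 - V)' \\ X_*' \end{bmatrix} e $$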

It is seen that, under the previous assumptions on e and X, the estimator is not unbiased for finite samples. However, it is asymptotically unbiased, that is, the sampling error $\eta$ satisfies $\lim E(\eta) = 0$, provided that each row of $Y_2 - V$ is asymptotically an exact and non-stochastic linear function of the corresponding row of X. This involves the assumption of consistent reduced-form estimation; hence:

For each pair t, t' (= 1, ..., T) and for each pair z, z' (= 1, ..., G), the parent reduced-form random errors $v_z(t)$ and $v_{z'}(t')$ corresponding to the right-hand variables $y_z$ and $y_{z'}$, respectively, of equation (1) have zero mean and satisfy

$$ E[\, v_z(t)\, v_{z'}(t') \,] = \begin{cases} \sigma_{zz'} & \text{if } t = t' \\ 0 & \text{if } t \neq t' \end{cases} $$

$\sigma_{zz'}$ being independent of t and t'.

This assumption is satisfied as soon as each of the G original equations of the system has random errors that satisfy a similar condition.

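In order to calculate the asymptotic standard errors, it is necessary to find, for the sampling error $\eta$,

$$ \lim E(T \eta\eta') = \lim T \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2-V)' \\ X_*' \end{bmatrix} E(ee')\, [\,(Y_2 - V) \;\; X_*\,] \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} $$

Since $E(ee')$ equals $\sigma^2 I$ and

$$ \begin{bmatrix} (Y_2-V)' \\ X_*' \end{bmatrix} [\,(Y_2 - V) \;\; X_*\,] = \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}, $$

the result is

$$ \sigma^2 \operatorname{plim}\, T \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} $$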

Theil (1958) has shown that although two-stage least squares has a larger variance than ordinary least squares, if the bias of ordinary least squares is corrected, the variance of two-stage least squares becomes the smaller of the two. In fact, he has shown his method to be a minimum variance unbiased estimator under the given assumptions. Basmann (1957), in his development of the method, shows that the estimators are best linear unbiased in addition to being consistent.

GEOMETRIC INTERPRETATION

In order to further clarify the method of two-stage least squares, the following geometric picture is presented. Consider a T-dimensional Cartesian space; along the first axis measure the first observation on each variable, along the second axis the second observation, etc. The values assumed by each variable are then represented by a point in this space, or alternatively by a vector from the origin 0 to this point. This leads to one point $Y_1$, corresponding to the left-hand dependent variable of the equation; to G-1 points $Y_2, \ldots, Y_G$ corresponding to the right-hand dependent variables; and to K points $X_1, \ldots, X_{K^*}, \ldots, X_K$ corresponding to the predetermined variables.

The first stage of two-stage least squares amounts to replacing $Y_2, \ldots, Y_G$ by their reduced values. This gives G-1 points $Y_2^*, \ldots, Y_G^*$ which are the projections of $Y_2, \ldots, Y_G$, respectively, on the K-dimensional plane determined by the K+1 points $0, X_1, \ldots, X_K$. In Fig. 1 this is illustrated for the case T=3, G=2, K=2.

The second step is the application of ordinary least squares with the left-hand $Y_1$ as dependent variable and the reduced right-hand Y's and K* of the X's as independent variables. This implies projecting $Y_1$ onto the (G-1+K*)-dimensional plane determined by $0, Y_2^*, \ldots, Y_G^*, X_1, \ldots, X_{K^*}$, which leads to a point $Y_1^*$. After this, the decomposition of the vector $0Y_1^*$ in terms of the vectors $0Y_2^*, \ldots, 0X_{K^*}$ gives the estimated coefficients according to two-stage least squares.


Figure 1. GEOMETRICAL ILLUSTRATION OF TWO-STAGE LEAST SQUARES
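The first-stage projection for the case T=3, K=2 can be reproduced with a few lines of arithmetic. The coordinates below are hypothetical and chosen so that the plane spanned by $X_1$ and $X_2$ is the horizontal plane, which makes the projection easy to read off.

```python
import numpy as np

# Hypothetical points in 3-dimensional observation space (T = 3)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])       # columns X1, X2 span the horizontal plane
Y2 = np.array([2.0, 1.0, 3.0])   # right-hand dependent variable

P = X @ np.linalg.inv(X.T @ X) @ X.T   # orthogonal projector onto the plane
Y2_star = P @ Y2                        # projection of Y2 onto the plane

print(Y2_star)                  # [2. 1. 0.]
print((Y2 - Y2_star) @ X)       # [0. 0.]
```

The second printed vector shows that the discarded part of $Y_2$ is perpendicular to both $X_1$ and $X_2$, which is the geometric counterpart of X'V = 0.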

AN  EVALUATION  OF  THE  METHOD 

While the two-stage least squares estimators may be shown to be asymptotically unbiased, consistent and minimum variance, this does not give an accurate picture of their performance when working with small samples of data. The advent of the computer has made it possible to conduct Monte Carlo studies on the small sample properties of the estimators in relation to other available methods. The sample size was generally chosen in the range of 15 to 40, to reflect the sample sizes which are typically found in practice.

The Monte Carlo studies by Basmann (1961), Nagar (1960), Summers (1965), and Wagner (1958) give insight into the choice of the best method of estimation for structural parameters under the restriction of small sample sizes. The evidence from these studies appears to indicate that the full-information maximum-likelihood method is the best available. However, it has serious disadvantages. The computational burden is very heavy, and the optimal properties of the estimator depend heavily upon the correctness of the specification of the model. In light of these disadvantages, this method is not considered to be of practical use.

Of the remaining methods available, two-stage least squares becomes the best method which can be practically applied to a system of linear equations. While the Basmann study shows the method to be superior by a more pronounced margin, all of the studies indicate the preferability of two-stage least squares.

A criticism of two-stage least squares given by G. C. Chow (1964) is that the choice of a dependent variable, say $Y_1$ for the first equation, etc., in the second stage seems arbitrary. The estimates will differ according to the choice made. In other words, it has not yet been specified in which direction the sum of squares should be minimized in the second stage. A second criticism is that the method does not adequately take into account the interdependence of the $e_i$ in different equations.

While it is generally conceded that this method is not the ultimate for finding the estimates under all conditions, it is the most universally applicable method and computationally the shortest method giving results of comparable validity. Until a better method is developed, it is certain that more and more applications will be found for two-stage least squares.

EXAMPLE 

The following illustrates how to find the estimates of coefficients using the two-stage least squares method. The model used is the Girshick-Haavelmo economic model, composed of the following structural equations:

$$ y_{1t} = \beta_{12} y_{2t} + \beta_{13} y_{3t} + \gamma_{18} x_{8t} + \gamma_{19} x_{9t} + \gamma_{10} + e_{1t} \tag{1} $$

$$ y_{1t} = \beta_{22} y_{2t} + \beta_{24} y_{4t} + \gamma_{28} x_{8t} + \gamma_{20} + e_{2t} \tag{2} $$

$$ y_{3t} = \gamma_{37} x_{7t} + \gamma_{39} x_{9t} + \gamma_{30} + e_{3t} \tag{3} $$

$$ y_{4t} = \beta_{45} y_{5t} + \gamma_{46} x_{6t} + \gamma_{48} x_{8t} + \gamma_{40} + e_{4t} \tag{4} $$

$$ y_{5t} = \beta_{52} y_{2t} + \gamma_{58} x_{8t} + \gamma_{50} + e_{5t} \tag{5} $$

t = 1, 2, ..., 20 sample observations,

where
$y_{it}$, i = 1, ..., 5, denote the endogenous variables;
$x_{jt}$, j = 6, ..., 9, denote the exogenous variables;
$\beta_{ij}$, i, j = 1, ..., 5, denote the coefficients of endogenous variables;
$\gamma_{ik}$, k = 6, ..., 9, denote the coefficients of exogenous variables;
$\gamma_{i0}$ denotes the intercept of the ith equation.


It is assumed the endogenous variables are jointly distributed according to the following reduced form equations:

$$ y_{it} = \pi_{i6} x_{6t} + \cdots + \pi_{i9} x_{9t} + \pi_{i0} + v_{it}, \qquad i = 1, \ldots, 5 \tag{6} $$

GENERALIZED  CLASSICAL  ESTIMATES 
TABLE  I* 



* The following series were used for the model:

$y_1$ is food consumption per capita published by the Bureau of Agricultural Economics (BAE). (An adjustment has been made in the official series for 1934 to exclude the quantity of meat purchased by the Government for relief purposes and distributed through noncommercial channels.)

$y_2$ is retail prices of food products (BAE), deflated by the Index of Consumer Prices for Moderate Income Families in Cities, published by the Bureau of Labor Statistics (BLS).

$y_3$ is disposable income per capita (Dept. of Commerce), deflated by the BLS Consumer Price Index.

$y_4$ is production of agricultural food products per capita (BAE).

$y_5$ is prices received by farmers for food products (BAE), deflated by the BLS Consumer Price Index.

$x_6 = y_{5,t-1}$ is prices received by farmers for food products, lagged one year.

$x_7$ is net investment per capita, i.e., disposable income minus consumers' expenditures, based on Dept. of Commerce data, deflated by the BLS Consumer Price Index.

$x_8 = t$ is time.

$x_9 = y_{3,t-1}$ is disposable income per capita, lagged one year.

All the data are expressed in terms of index numbers (1935-39 = 100) except for time, $x_8$, which has the values 1, 2, ..., 20. The analysis covers the period 1922 through 1941.


(a) Preliminary Computations

Compute:

$$ M_{yy} = \left[\, \sum_{t=1}^{20} (y_{it} - \bar{y}_i)(y_{jt} - \bar{y}_j) \right], \qquad i, j = 1, 2, 4, 5 $$

where

$$ \bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}, \qquad T = 20. $$

This will form a matrix of sums of squares and cross products of deviations over all the y's appearing in all the equations to be estimated. In this example it is required to estimate only equations (2), (4) and (5). Since $y_{3t}$ does not appear in any of these, its sums of squares and cross products need not be computed.
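The moment matrices of this section can be formed as in the sketch below. The Table I series are not reproduced here, so the data are random placeholders; only the time column (1, 2, ..., 20, placed third to stand for $x_8$) follows the text. The function name `moment_matrices` is hypothetical.

```python
import numpy as np

def moment_matrices(Y, X):
    """Sums of squares and cross products of deviations from the means."""
    Yd = Y - Y.mean(axis=0)
    Xd = X - X.mean(axis=0)
    return Yd.T @ Yd, Yd.T @ Xd, Xd.T @ Xd   # M_yy, M_yx, M_xx

# Placeholder data: T = 20 observations on 4 endogenous and 4 exogenous series
rng = np.random.default_rng(5)
Y = rng.normal(100.0, 10.0, size=(20, 4))    # stands in for y1, y2, y4, y5
X = rng.normal(100.0, 10.0, size=(20, 4))    # stands in for x6, x7, x8, x9
X[:, 2] = np.arange(1, 21)                   # x8 is time, t = 1, ..., 20

M_yy, M_yx, M_xx = moment_matrices(Y, X)
print(M_xx[2, 2])   # sum of squared deviations of t = 1..20, i.e. 665.0
```

The printed value agrees with the diagonal entry 665.0000 reported for time in the text, which is a useful check that the deviation form is being used.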

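From the data in Table I, with rows and columns in the order $y_1, y_2, y_4, y_5$ (the matrix is symmetric, so only the upper triangle is shown):

$$ M_{yy} = \begin{bmatrix} 151.7295 & 62.4765 & 180.8860 & 226.1155 \\ & 583.2285 & 108.8320 & 1231.2685 \\ & & 391.6480 & 320.2040 \\ & & & 3164.9495 \end{bmatrix} $$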


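Compute

$$ M_{yx} = \left[\, \sum_{t=1}^{20} (y_{it} - \bar{y}_i)(x_{kt} - \bar{x}_k) \right], \qquad i = 1, 2, 4, 5; \quad k = 6, 7, 8, 9 $$

With rows in the order $y_1, y_2, y_4, y_5$ and columns in the order $x_6, x_7, x_8, x_9$:

$$ M_{yx} = \begin{bmatrix} 257.7885 & 1793.9165 & 72.2500 & 401.6250 \\ 920.2995 & 1870.0155 & -257.7500 & 430.5750 \\ 364.6480 & 2073.5520 & -172.7000 & 306.1900 \\ 2169.4165 & 6297.5185 & -306.8500 & 1290.2350 \end{bmatrix} $$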


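Compute

$$ M_{xx} = \left[\, \sum_{t=1}^{20} (x_{kt} - \bar{x}_k)(x_{nt} - \bar{x}_n) \right], \qquad k, n = 6, 7, 8, 9 $$

With rows and columns in the order $x_6, x_7, x_8, x_9$ (upper triangle of the symmetric matrix):

$$ M_{xx} = \begin{bmatrix} 3071.7255 & 3963.8095 & 415.2500 & 1714.9250 \\ & 32367.3055 & 658.1500 & 4956.0350 \\ & & 665.0000 & 317.7000 \\ & & & 2067.0700 \end{bmatrix} $$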
Compute, by means of the forward Doolittle method, the matrix

$$ M^*_{yy} = M_{yx} M_{xx}^{-1} M_{yx}' $$

(b) Computation for a Single Structural Equation

For expository purposes suppose it is required to compute estimates of the coefficients $\beta_{1j}$ and $\gamma_{1k}$ appearing in the structural equation

$$ -y_{1t} + \sum_{j \neq 1} y_{jt}\beta_{1j} + \sum_{k} x_{kt}\gamma_{1k} + e_{1t} = 0 \tag{7} $$

The variable $y_1$ appears with coefficient $\beta_{11} = -1$. Let $y_\Delta$ denote the vector of H-1 < G endogenous variables (except $y_1$) appearing in (7) with non-zero coefficients, and let $x_\Delta$ denote the vector of K* < K exogenous variables appearing in (7) with non-zero coefficients. It is assumed that the necessary conditions for the identification of (7) are met; i.e. $K \ge K^* + H$.

Define:

$M_{y_\Delta x_\Delta}$ as the submatrix of $M_{yx}$ involving only the sums of cross products of endogenous and exogenous variables appearing with non-zero coefficients in (7).

$M_{x_\Delta x_\Delta}$ as the submatrix of $M_{xx}$ involving only the sums of squares and cross products of the exogenous variables appearing with non-zero coefficients in (7).

$M^*_{y_\Delta y_\Delta}$ as the submatrix of $M^*_{yy}$ corresponding to the endogenous variables appearing among the $y_\Delta$.

$M^*_{y_1 y_\Delta}$ as the 1 x (H-1) submatrix of $M^*_{yy}$ corresponding to $y_1$ and $y_\Delta$.

$M_{y_1 x_\Delta}$ as the 1 x K* submatrix of $M_{yx}$ involving only the sums of cross products of $y_1$ and the exogenous variables appearing with non-zero coefficients in (7).

$M_{y_1 y_\Delta}$ as the corresponding submatrix of $M_{yy}$.

Form the compound matrices

$$ S = \begin{bmatrix} M^*_{y_\Delta y_\Delta} & M_{y_\Delta x_\Delta} \\ M'_{y_\Delta x_\Delta} & M_{x_\Delta x_\Delta} \end{bmatrix} $$

which is square of order H-1+K*, and

$$ M_{y_1 z} = [\, M^*_{y_1 y_\Delta} \;\; M_{y_1 x_\Delta} \,], \qquad 1 \times (H-1+K^*) $$

Arrange the above matrices for Doolittle computation

$$ [\,S\,]\;[\,M'_{y_1 z}\,]\;[\,I_{H-1+K^*}\,] $$

and compute the vector of sample estimates, $(b_\Delta, c_\Delta)$, of the non-zero coefficients $\beta_{1j}$ and $\gamma_{1k}$ appearing in (7), where

$$ \begin{bmatrix} b_\Delta \\ c_\Delta \end{bmatrix} = S^{-1} M'_{y_1 z} $$


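Compute the sample variance $\omega$ of the residual $e_{1t}$ in equation (7):

$$ \omega = T^{*-1} \left[\, M_{y_1 y_1} - 2 M'_{y_\Delta y_1} b_\Delta + 2 M_{y_1 x_\Delta} M_{x_\Delta x_\Delta}^{-1} M'_{y_\Delta x_\Delta} b_\Delta + b_\Delta' \left( M_{y_\Delta y_\Delta} - M_{y_\Delta x_\Delta} M_{x_\Delta x_\Delta}^{-1} M'_{y_\Delta x_\Delta} \right) b_\Delta \right] $$

where T* = T - K - 1.

Compute the sample variance-covariance matrix of the estimated coefficients $b_\Delta$ and $c_\Delta$:

$$ \operatorname{Var-Cov}(b_\Delta, c_\Delta) = \omega S^{-1} $$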

To  illustrate  the  numerical  application  of  the  steps 
outlined  in  Section  (b),  the  method  is  applied  to  Equation  2. 
The  appropriate  Doolittle  layout  is  exhibited  in  Table  II. 


TABLE II

          y2            y4            x8           M'(y1 z)       I
 y2   342.036418    201.496935    -257.7500      97.187553     1  0  0
 y4                 221.303177    -172.7000     115.531281     0  1  0
 x8                                665.0000      72.2500       0  0  1


Results of the Doolittle computations applied to Table II are exhibited in Table III, along with the corresponding estimated standard deviations, given in parentheses. The sample variances of residuals have been computed according to Section (b) and are exhibited in the last column.


TABLE III
ESTIMATES OF EQUATION (2)

    y2          y4          x8
  0.1633      0.6366      0.3372
 (.0997)     (.1168)     (.0545)
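The coefficient estimates for equation (2) can be recovered from the Table II layout by elimination. The sketch below uses plain forward elimination without pivoting followed by back substitution, in the spirit of the forward Doolittle layout but not a reproduction of the worksheet; the function name `solve_normal_equations` is hypothetical. The (y4, x8) entry is entered as -172.7000, the value given for that cross product in $M_{yx}$.

```python
def solve_normal_equations(S, rhs):
    """Solve S b = rhs by forward elimination and back substitution."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched
    aug = [list(map(float, S[i])) + [float(rhs[i])] for i in range(n)]
    for p in range(n):                      # forward solution
        for r in range(p + 1, n):
            factor = aug[r][p] / aug[p][p]
            for c in range(p, n + 1):
                aug[r][c] -= factor * aug[p][c]
    b = [0.0] * n
    for i in reversed(range(n)):            # back solution
        tail = sum(aug[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (aug[i][n] - tail) / aug[i][i]
    return b

# Table II written out as a full symmetric matrix (order: y2, y4, x8)
S = [[342.036418, 201.496935, -257.7500],
     [201.496935, 221.303177, -172.7000],
     [-257.7500, -172.7000, 665.0000]]
m = [97.187553, 115.531281, 72.2500]        # cross products with y1

b = solve_normal_equations(S, m)
print([round(v, 3) for v in b])
```

The three values are the coefficients of $y_2$, $y_4$ and $x_8$, and they agree with the Table III estimates to about three decimal places.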
The two-stage least squares method of linear estimation of coefficients was developed to replace existing methods by providing a method of more general applicability while being less expensive to apply. The basic idea of two-stage least squares is to apply simple least squares to the entire set of predetermined variables to obtain estimates for the dependent variables. Then, with these estimates substituted in the system, simple least squares may be applied again to a particular equation or a set of equations of the system to estimate the required parameters.

A typical set of equations of a linear system containing G dependent variables, $y_{it}$, and K predetermined variables, $x_{it}$, may be written as

$$ y_{1t} = \sum_{i=2}^{H} \beta_{1i} y_{it} + \sum_{i=1}^{K^*} \gamma_{1i} x_{it} + e_{1t}, \qquad t = 1, \ldots, T $$

where $H - 1 < G$; $K^* < K$.

In addition to the usual assumptions on the random error vector, the assumption $K \ge K^* + H - 1$ will assure a solution of the system by the two-stage least squares method.

By successive application of simple least squares the parameter estimates are

$$ \begin{bmatrix} \hat{\beta}_2 \\ \hat{\gamma}_* \end{bmatrix} = \begin{bmatrix} Y_2'Y_2 - V'V & Y_2'X_* \\ X_*'Y_2 & X_*'X_* \end{bmatrix}^{-1} \begin{bmatrix} (Y_2 - V)'y_1 \\ X_*'y_1 \end{bmatrix} $$

where V denotes the matrix of reduced-form residuals found in the first application of simple least squares. These estimators are asymptotically unbiased, consistent and minimum-variance.

Several Monte Carlo studies have been made to determine the performance of the two-stage least squares method when working with small samples of data. The studies made to date indicate that two-stage least squares is the best available method which can be applied practically to a system of linear equations. While it is generally conceded that this method is not the ultimate for finding the estimates under all conditions, it is the most universally applicable method and computationally the shortest method giving results of comparable validity.


